reprocess failed account events #6571

escattone · 2025-03-18T20:54:28Z

Notes

This work introduces a periodic cron task, run every 4 hours, that gathers all account events created within the last 24 hours that are still unprocessed and queues each of them for re-processing. So, in addition to the Celery retry mechanism on exceptions (which I reduced from 4 retries to 3, and which plays out over about 15 seconds), the processing of failed (unprocessed) account event tasks will be attempted 5-6 times over the 24-hour period starting from the moment of their creation. Account events that remain unprocessed after 24 hours will no longer be re-processed.
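As a rough illustration of the selection step described above, here is a minimal standalone sketch. It is not the actual management command: the `Event` class, the `"unprocessed"` status value, and the `events_to_reprocess` helper are hypothetical stand-ins, and the real code presumably queries the database rather than filtering a list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Event:
    """Hypothetical, simplified stand-in for an account event record."""

    id: int
    status: str
    created_at: datetime


def events_to_reprocess(events, now, within_hours=24):
    """Return the unprocessed events created within the last `within_hours`."""
    cutoff = now - timedelta(hours=within_hours)
    return [
        e for e in events
        if e.status == "unprocessed" and e.created_at >= cutoff
    ]


if __name__ == "__main__":
    now = datetime(2025, 3, 18, 12, 0, tzinfo=timezone.utc)
    events = [
        Event(1, "unprocessed", now - timedelta(hours=3)),   # selected
        Event(2, "processed", now - timedelta(hours=3)),     # already done
        Event(3, "unprocessed", now - timedelta(hours=30)),  # too old
    ]
    for event in events_to_reprocess(events, now):
        print(f"queue event {event.id} for re-processing")
```

Events older than the window simply fall out of the selection, which is how an event stops being retried once it is more than 24 hours old.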

I've also added DMS Snitches for stage and prod, and defined their URLs for settings.DMS_REPROCESS_FAILED_ACCOUNT_EVENTS in our GCP secrets.

As part of this work, I made the following changes to each of the tasks within kitsune.users.tasks:

Removed the @skip_if_read_only_mode decorators. They were useless even while the failover clusters still had Celery workers, because those workers would still "steal" events, and they are even more so now that the Celery workers in the failover cluster have been removed.
Added the @transaction.atomic decorators, so that all of the DB changes for each task are handled as an atomic chunk. If any one of them fails, all of the others are rolled back. This prevents failures from leaving behind a residue of partially-completed work, so that retries always start with a clean slate.
Added a check at the beginning of each task, when the account event is fetched, that skips any further work if the event no longer exists or has already been processed. This ensures robust handling when multiple Celery tasks are queued for the same event, which could happen given the new periodic cron task that re-processes failed account events.
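The last two changes above (atomic DB work plus an early exit for missing or already-processed events) can be sketched roughly as follows. This is a standalone illustration, not the code in kitsune.users.tasks: it uses the standard-library sqlite3 module so that it runs without Django, with `with conn:` standing in for @transaction.atomic, and the table names, columns, and `process_event` function are all hypothetical.

```python
import sqlite3

UNPROCESSED, PROCESSED = "unprocessed", "processed"


def make_db():
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account_event (id INTEGER PRIMARY KEY, status TEXT)"
    )
    conn.execute(
        "CREATE TABLE profile (user_id INTEGER PRIMARY KEY, is_subscriber INTEGER)"
    )
    conn.commit()
    return conn


def process_event(conn, event_id, user_id, fail=False):
    """Process one event; return True if work was committed, False if skipped."""
    # `with conn:` commits on success and rolls back on any exception,
    # playing the role that @transaction.atomic plays in a Django task.
    with conn:
        row = conn.execute(
            "SELECT status FROM account_event WHERE id = ?", (event_id,)
        ).fetchone()
        # Skip if the event no longer exists or was already processed,
        # e.g. because a duplicate task for the same event got there first.
        if row is None or row[0] != UNPROCESSED:
            return False
        conn.execute(
            "UPDATE profile SET is_subscriber = 1 WHERE user_id = ?", (user_id,)
        )
        if fail:
            # Simulated mid-task failure: the profile update above is undone.
            raise RuntimeError("simulated failure mid-task")
        conn.execute(
            "UPDATE account_event SET status = ? WHERE id = ?",
            (PROCESSED, event_id),
        )
        return True
```

With this shape, a failed attempt leaves both tables untouched, so a later retry (from Celery or from the cron task) starts from the same state as the first attempt, and a duplicate task for an already-processed event becomes a no-op.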

Copilot

Pull Request Overview

This PR introduces a periodic cron task to reprocess failed account events within the past 24 hours. It reduces the Celery retry count, adds transaction atomicity to task processing, and updates tests and the management command accordingly.

Adds a new cron job in scripts/cron.py to trigger the reprocessing command every 4 hours.
Removes the skip_if_read_only_mode decorators and wraps task functions in transaction.atomic.
Updates tests to verify the atomicity of delete, subscription state change, and password change events.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
scripts/cron.py	Adds a new scheduled job for reprocessing failed account events.
kitsune/users/tests/test_tasks.py	Introduces tests to ensure atomicity for various account event tasks.
kitsune/users/tasks.py	Removes read-only mode checks, adds transaction.atomic, and adjusts retry count.
kitsune/users/management/commands/reprocess_failed_account_events.py	Updates command help text and argument naming to reflect hours instead of days.
kitsune/settings.py	Introduces a new settings variable for DMS reprocessing.

escattone requested a review from akatsoulas March 19, 2025 02:31

escattone force-pushed the reprocess-failed-account-event-tasks branch from ff10628 to 0d4eacf Compare March 24, 2025 21:29

akatsoulas requested a review from Copilot March 31, 2025 10:05

Copilot AI reviewed Mar 31, 2025

reprocess failed account events

cb917ad

escattone force-pushed the reprocess-failed-account-event-tasks branch from 0d4eacf to cb917ad Compare March 31, 2025 16:55

akatsoulas approved these changes Apr 1, 2025

escattone merged commit 09ebaa3 into mozilla:main Apr 1, 2025
2 checks passed

escattone deleted the reprocess-failed-account-event-tasks branch April 1, 2025 16:32
